{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Bring Your Own Container\n",
    "\n",
    "Do this if you need to use GPUs, for example:\n",
    "\n",
    "* https://xgboost.readthedocs.io/en/latest/gpu/index.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install -q boto3\n",
    "!pip install -q xgboost"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "import sagemaker\n",
    "import pandas as pd\n",
    "\n",
    "sess   = sagemaker.Session()\n",
    "bucket = sess.default_bucket()\n",
    "role = sagemaker.get_execution_role()\n",
    "region = boto3.Session().region_name\n",
    "\n",
    "sm = boto3.Session().client(service_name='sagemaker', region_name=region)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## TODO:  Create a custom container with XGBoost GPU installed\n",
    "See https://xgboost.readthedocs.io/en/latest/gpu/index.html"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you are using TensorFlow, PyTorch, or another SageMaker supported framework, the SageMaker Python SDK will automatically upload the model for you.\n",
    "\n",
    "However, if you need to manually upload your model to S3, then you should create a Docker image per the steps below._\n",
    "\n",
    "1. Build a custom container with XGBoost (`without train.py`)\n",
    "2. `pip install sagemaker-containers` \n",
    "3. Push to ECR.\n",
    "4. Create a tar file with your `train.py`\n",
    "5. Upload to S3\n",
    "6. Add the ECR `image_uri` to your `Estimator`\n",
    "7. Add the following 2 hyperparameters to your `Estimator`:\n",
    "```\n",
    "{'sagemaker_program': 'train.py',\n",
    "'sagemaker_submit_directory': 's3://bucket/prefix/training-scripts/sourcedir.tar.gz'}\n",
    "```\n",
    "8. Call `estimator.fit()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker.amazon.amazon_estimator import get_image_uri \n",
    "\n",
    "# get the URI for new container\n",
    "custom_container_uri = get_image_uri(region_name=region,                                \n",
    "                                     repo_name='xgboost', \n",
    "                                     repo_version='0.90-2')\n",
    "\n",
    "model_output_path = 's3://{}/models/amazon-reviews/xgboost-byoc'.format(bucket)\n",
    "\n",
    "# TODO:  Figure out `requirements.txt`\n",
    "xgb_estimator = sagemaker.estimator.Estimator(image_name=builtin_container_uri, \n",
    "                                              role=role, \n",
    "                                              hyperparameters={'objective':'binary:logistic',\n",
    "                                                               'num_round': 1,\n",
    "                                                               'max_depth': 5,\n",
    "                                                               'sagemaker_program': 'xgboost_reviews.py',\n",
    "                                                               'sagemaker_submit_directory': 's3://bucket/prefix/training-scripts/sourcedir.tar.gz'},\n",
    "                                              train_instance_count=1, \n",
    "                                              train_instance_type='ml.m4.xlarge', \n",
    "                                              output_path=model_output_path, \n",
    "                                              sagemaker_session=sess,\n",
    "                                              enable_cloudwatch_metrics=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "xgb_estimator.fit({'train': s3_input_train_data,\n",
    "                   'validation': s3_input_validation_data\n",
    "                  }\n",
    "                  #, wait=False\n",
    "                 )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "training_job_name = xgb_estimator.latest_training_job.name\n",
    "print('training_job_name:  {}'.format(training_job_name))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# download the model artifact from AWS S3\n",
    "!aws s3 cp $model_output_path/$training_job_name/output/model.tar.gz ./models/byoc/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tarfile\n",
    "import pickle as pkl\n",
    "\n",
    "# TODO:  extract to ./model/built-in/\n",
    "\n",
    "#opens the downloaded model artifcat and loads it as 'model' variable\n",
    "tar = tarfile.open('./models/byoc/model.tar.gz')\n",
    "tar.extractall(path='./models/byoc/')\n",
    "tar.close()\n",
    "model = pkl.load(open('xgboost-model', 'rb'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!ls -al ./models/byoc/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictions, raw_outputs = bert_model.predict([\"\"\"Very funny. A typical mid 50's comedy.\"\"\"])\n",
    "print('Predictions: {}'.format(predictions))\n",
    "print('Raw outputs: {}'.format(raw_outputs))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictions, raw_outputs = bert_model.predict([\"\"\"That movie was absolutely awful.\"\"\"])\n",
    "print('Predictions: {}'.format(predictions))\n",
    "print('Raw outputs: {}'.format(raw_outputs))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
